16 research outputs found

    Robust self-localisation and navigation based on hippocampal place cells

    A computational model of the hippocampal function in spatial learning is presented. A spatial representation is incrementally acquired during exploration. Visual and self-motion information is fed into a network of rate-coded neurons. A consistent and stable place code emerges by unsupervised Hebbian learning between place and head direction cells. Based on this representation, goal-oriented navigation is learnt by applying a reward-based learning mechanism between the hippocampus and nucleus accumbens. The model, validated on a real and a simulated robot, successfully localises itself by recalibrating its path integrator using visual input. A navigation map is learnt after about 20 trials, comparable to rats in the water maze. In contrast to previous work, this system processes realistic visual input, no compass is needed for localisation, and the reward-based learning mechanism extends discrete navigation models to continuous space. The model reproduces experimental findings and suggests several neurophysiological and behavioural predictions in the rat.
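
    The abstract describes an unsupervised Hebbian association between rate-coded sensory input and place units. The following is a minimal sketch of that idea only, not the published model: the network sizes, rectified activation, learning rate and row normalisation are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    N_VISUAL = 100   # rate-coded visual input units (illustrative size)
    N_PLACE = 50     # place units (illustrative size)

    def hebbian_step(W, visual_input, eta=0.01):
        """One exploration step: rectified place-unit activity followed by a
        Hebbian update; row normalisation keeps the emerging place code stable."""
        place_activity = np.maximum(W @ visual_input, 0.0)          # rate-coded, rectified
        W = W + eta * np.outer(place_activity, visual_input)        # Hebb: post * pre
        W = W / (np.linalg.norm(W, axis=1, keepdims=True) + 1e-12)  # normalise rows
        return W, place_activity

    # Incremental acquisition during (simulated) exploration.
    W = rng.normal(scale=0.01, size=(N_PLACE, N_VISUAL))
    for _ in range(1000):
        visual_input = rng.random(N_VISUAL)    # stand-in for processed visual input
        W, _ = hebbian_step(W, visual_input)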

    Modelling Path Integrator Recalibration Using Hippocampal Place Cells

    The firing activities of place cells in the rat hippocampus exhibit strong correlations with the animal's location. External (e.g. visual) as well as internal (proprioceptive and vestibular) sensory information take part in controlling hippocampal place fields. It has previously been observed that when rats shuttle between a movable origin and a fixed target, the hippocampus encodes position in two different frames of reference. This paper presents a new model of hippocampal place cells that explains place coding in multiple reference frames by continuous interaction between visual and self-motion information. The model is tested using a simulated mobile robot in a real-world experimental paradigm.
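
    As a schematic illustration of the recalibration idea (not the place-cell network itself), the sketch below lets a path integrator drift under noisy self-motion and pulls it back toward an occasional visually derived position fix; the linear blending, the gain and the fix schedule are assumptions made for illustration.

    import numpy as np

    rng = np.random.default_rng(1)

    def recalibrate(pi_estimate, visual_estimate, visual_gain=0.3):
        """Pull the path-integrator estimate toward the visually derived position;
        the gain sets how strongly vision corrects accumulated self-motion drift."""
        return (1.0 - visual_gain) * pi_estimate + visual_gain * visual_estimate

    true_pos = np.zeros(2)
    pi_estimate = np.zeros(2)
    for step in range(100):
        velocity = np.array([0.05, 0.02])                  # commanded self-motion
        true_pos = true_pos + velocity
        # Path integration uses a noisy copy of the self-motion signal and drifts.
        pi_estimate = pi_estimate + velocity + rng.normal(scale=0.01, size=2)
        if step % 20 == 19:                                # occasional visual fix
            visual_fix = true_pos + rng.normal(scale=0.02, size=2)
            pi_estimate = recalibrate(pi_estimate, visual_fix)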

    Learning to reach by reinforcement learning using a receptive field based function approximation approach with continuous actions

    Reinforcement learning methods can be used in robotics applications, especially for specific target-oriented problems such as the reward-based recalibration of goal-directed actions. To this end, relatively large and continuous state-action spaces still need to be handled efficiently. The goal of this paper is therefore to develop a novel, rather simple method that uses reinforcement learning with function approximation in conjunction with different reward strategies for solving such problems. To test our method, we use a four degree-of-freedom reaching problem in 3D space, simulated by a two-joint robot arm system with two DOF per joint. Function approximation is based on 4D, overlapping kernels (receptive fields), and the state-action space contains about 10,000 of these. Different types of reward structures are compared, for example reward-on-touching-only against reward-on-approach. Furthermore, forbidden joint configurations are punished. A continuous action space is used. In spite of the rather large number of states and the continuous action space, these reward/punishment strategies allow the system to find a good solution, usually within about 20 trials. The efficiency of our method demonstrated in this test scenario suggests that it might be possible to use it on a real robot for problems where mixed rewards can be defined and where other types of learning might be difficult.
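
    A rough sketch of the receptive-field approach described here: overlapping Gaussian kernels tile the 4D joint-angle space (10 centres per dimension give the roughly 10,000 kernels mentioned), a continuous action is read out as an activation-weighted blend, and a reward-modulated update shifts the active kernels' preferred actions. The kernel width, the update rule and all parameter values are illustrative assumptions rather than the paper's implementation.

    import numpy as np

    centres_1d = np.linspace(-np.pi, np.pi, 10)
    grids = np.meshgrid(*[centres_1d] * 4, indexing="ij")
    CENTRES = np.stack([g.ravel() for g in grids], axis=1)   # shape (10000, 4)
    SIGMA = 0.4                                              # kernel width (assumed)

    def activations(joint_angles):
        """Normalised receptive-field activations for one 4D joint configuration."""
        d2 = np.sum((CENTRES - joint_angles) ** 2, axis=1)
        phi = np.exp(-d2 / (2.0 * SIGMA ** 2))
        return phi / (phi.sum() + 1e-12)

    def continuous_action(weights, joint_angles):
        """Continuous action: activation-weighted blend of per-kernel joint steps."""
        return activations(joint_angles) @ weights           # weights: (10000, 4)

    def reward_update(weights, joint_angles, action_taken, reward, alpha=0.1):
        """Reward-modulated update pulling the active kernels' preferred actions
        toward (reward) or away from (punishment) the executed action."""
        phi = activations(joint_angles)
        return weights + alpha * reward * np.outer(phi, action_taken - phi @ weights)

    # Example step with a dummy state, an exploratory action and a touch reward.
    weights = np.zeros((CENTRES.shape[0], 4))
    state = np.array([0.1, -0.3, 0.7, 0.0])
    action = continuous_action(weights, state) + np.random.normal(scale=0.1, size=4)
    weights = reward_update(weights, state, action, reward=1.0)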

    Mathematical properties of neuronal TD-rules and differential Hebbian learning: a comparison

    A confusingly wide variety of temporally asymmetric learning rules exists, related to reinforcement learning and/or to spike-timing dependent plasticity; many of them look exceedingly similar while displaying strongly different behavior. These rules are often used in control tasks, for example in robotics, where rigorous convergence and numerical stability are required. The goal of this article is to review these rules and compare them, to provide a better overview of their different properties. Two main classes will be discussed: temporal difference (TD) rules and correlation-based (differential Hebbian) rules, together with some transition cases. In general we will focus on neuronal implementations with changeable synaptic weights and a time-continuous representation of activity. In a machine learning (non-neuronal) context, a solid mathematical theory for TD learning has existed for several years; this can partly be transferred to a neuronal framework. On the other hand, a more complete theory has only now emerged for differential Hebbian rules as well. In general, the rules differ in their convergence conditions and their numerical stability, which can lead to very undesirable behavior when applying them. For TD, convergence can be enforced with a certain output condition assuring that the δ-error drops on average to zero (output control). Correlation-based rules, on the other hand, converge when one input drops to zero (input control). Temporally asymmetric learning rules treat situations where incoming stimuli follow each other in time. Thus, it is necessary to remember the first stimulus in order to relate it to the later occurring second one. To this end, different types of so-called eligibility traces are used by these two types of rules. This aspect again leads to different properties of TD and differential Hebbian learning, as discussed here. Thus this paper, while also presenting several novel mathematical results, is mainly meant to provide a road map through the different neuronally emulated temporally asymmetric learning rules and their behavior, to provide some guidance for possible applications.
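
    The contrast between the two rule families can be made concrete in a few lines. The sketch below is a generic Euler-discretised form written for this summary, not the specific neuronal implementations analysed in the paper: the TD rule correlates a δ-error with an eligibility trace of the input (output control), while the differential Hebbian rule correlates the trace-filtered input with the temporal derivative of the output (input control). Time constants and learning rates are placeholders.

    def update_trace(trace, x, tau=0.2, dt=0.01):
        """Low-pass eligibility trace bridging the delay between the earlier
        and the later stimulus; both rule families need such a memory."""
        return trace + dt * (x - trace) / tau

    def td_step(w, x_trace, r, v_now, v_prev, alpha=0.05, gamma=0.95):
        """Neuronal TD rule: the delta-error is correlated with the input trace.
        Convergence relies on output control (delta driven to zero on average)."""
        delta = r + gamma * v_now - v_prev
        return w + alpha * delta * x_trace, delta

    def diff_hebb_step(w, u_trace, v_now, v_prev, mu=0.05, dt=0.01):
        """Differential Hebbian rule: the trace-filtered input is correlated with
        the derivative of the output. Convergence relies on input control
        (learning stops when the earlier input drops to zero)."""
        dv_dt = (v_now - v_prev) / dt
        return w + mu * u_trace * dv_dt * dt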

    Odor supported place cell model and goal navigation in rodents

    Experiments with rodents demonstrate that visual cues play an important role in the control of hippocampal place cells and spatial navigation. Nevertheless, rats may also rely on auditory, olfactory and somatosensory stimuli for orientation. It is also known that rats can track odors or self-generated scent marks to find a food source. Here we model odor-supported place cells using a simple feed-forward network and analyze the impact of olfactory cues on place cell formation and spatial navigation. The obtained place cells are used to solve a goal navigation task by a novel mechanism based on self-marking by odor patches combined with a Q-learning algorithm. We also analyze the impact of place cell remapping on goal-directed behavior when switching between two environments. We emphasize the importance of olfactory cues in place cell formation and show that the use of environmental and self-generated olfactory cues, together with a mixed navigation strategy, improves goal-directed navigation.
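
    A toy sketch of the two ingredients combined here: a feed-forward mapping from olfactory cues to place-cell activity, and a standard tabular Q-learning update for the goal navigation stage. The odour-source layout, Gaussian tuning and all parameter values are illustrative assumptions, not the published model.

    import numpy as np

    rng = np.random.default_rng(3)

    ODOR_SOURCES = np.array([[0.2, 0.8], [0.7, 0.3], [0.5, 0.5]])  # assumed layout
    SIGMA = 0.15                                                   # tuning width (assumed)

    def odor_input(position):
        """Concentration-like olfactory signal from each odour source."""
        d2 = np.sum((ODOR_SOURCES - position) ** 2, axis=1)
        return np.exp(-d2 / (2.0 * SIGMA ** 2))

    def place_activity(position, W):
        """Simple feed-forward place-cell layer driven by the olfactory input."""
        return np.maximum(W @ odor_input(position), 0.0)

    def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
        """Standard tabular Q-learning step used for goal navigation."""
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        return Q

    # Example: 25 place cells over 3 odour inputs; 100 discrete states, 4 actions.
    W = rng.random((25, 3))
    Q = np.zeros((100, 4))
    Q = q_update(Q, s=12, a=2, r=1.0, s_next=13)   # e.g. reaching a marked patch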

    Spatial Representation and Navigation in a Bio-inspired Robot

    A biologically inspired computational model of rodent representation-based (locale) navigation is presented. The model combines visual input in the form of realistic two-dimensional grey-scale images and odometer signals to drive the firing of simulated place and head direction cells via Hebbian synapses. The space representation is built incrementally and on-line, without any prior information about the environment, and consists of a large population of location-sensitive units (place cells) with overlapping receptive fields. Goal navigation is performed using reinforcement learning in continuous state and action spaces, where the state space is represented by the population activity of the place cells. The model is able to reproduce a number of behavioral and neurophysiological data on rodents. The performance of the model was tested on both simulated and real mobile Khepera robots in a set of behavioral tasks and is comparable to the performance of animals in similar tasks.
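
    One way to picture reinforcement learning on top of a place-cell population code, as described in this abstract, is a generic actor-critic readout: the critic's value and the actor's continuous movement direction are both linear functions of the population activity and are updated by the same TD error. This is a sketch of the general scheme under assumed sizes and rates, not the specific learning rule of the model.

    import numpy as np

    N_PLACE = 200          # place-cell population size (illustrative)

    def critic_value(place_activity, v_weights):
        """State value as a linear readout of the place-cell population activity."""
        return place_activity @ v_weights

    def actor_direction(place_activity, a_weights):
        """Continuous movement direction as a population-weighted 2D vector."""
        direction = place_activity @ a_weights              # a_weights: (N_PLACE, 2)
        return direction / (np.linalg.norm(direction) + 1e-12)

    def actor_critic_update(place_activity, r, v_now, v_prev, executed_dir,
                            v_weights, a_weights, alpha=0.05, gamma=0.95):
        """The same TD error modulates both readouts (a generic actor-critic
        sketch for continuous state and action spaces)."""
        delta = r + gamma * v_now - v_prev
        v_weights = v_weights + alpha * delta * place_activity
        a_weights = a_weights + alpha * delta * np.outer(place_activity, executed_dir)
        return v_weights, a_weights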
